Campaign group noyb takes on Meta over AI training under ‘legitimate interests’
Posted: May 23, 2025
Meta intends to train its Large Language Model (LLM), “Llama”, using Facebook and Instagram users’ public posts on an “opt-out” basis. The company claims its approach is justified by an opinion from the European Data Protection Board (EDPB).
Privacy campaign group noyb disagrees, alleging that Meta’s plans would violate the General Data Protection Regulation (GDPR).
The case tackles one of the hardest problems at the intersection of data protection and artificial intelligence: Can controllers train AI models on people’s personal data without their consent?
Meta’s AI training plan
Meta announced via a news post on April 14, 2025, that it would “soon” begin training its AI models on:
- “The interactions that people have with AI at Meta,” like “questions and queries”, and
- “Public content shared by adults on Meta Products,” such as “public posts and comments.”
Meta said it would provide EU-based users with in-app and email notifications to explain “the kind of data we’re going to start using, how this will improve AI at Meta and the overall user experience.”
Meta also said it would provide “a form where people can object to their data being used in this way at any time.”
While the post did not mention Meta’s intended legal basis for processing personal data in this way, it cited an EDPB Opinion from December 2024, presumably Opinion 28/2024 on certain data protection aspects related to the processing of personal data in the context of AI models.
“We welcome the opinion provided by the EDPB in December, which affirmed that our original approach met our legal obligations,” Meta said.
Does the EDPB support Meta’s approach?
The EDPB adopted Opinion 28/2024 in response to several questions about the GDPR and AI training from the Irish Data Protection Authority (DPA).
Among other questions, the Irish DPA asked how controllers can “demonstrate the appropriateness” of relying on “legitimate interests” for the development and deployment of AI models.
Meta’s reliance on a transparency notice and opt-out form suggests that the company is planning to rely on legitimate interests to use personal data for AI training purposes.
Here are some points that the EDPB made that might have encouraged Meta to make this move:
- There is “no hierarchy between the legal bases”, so legitimate interests can, in theory, apply to most data processing activities if the controller satisfies the “three-part test”.
- Data subjects’ reasonable expectations bear on the “balancing test”. If personal data is publicly available, using it for AI training is more likely to meet those expectations.
- Controllers can also implement “mitigating measures” to tip the balance of interests in their favour. One such measure could be “creating an opt-out list, managed by the controller and which allows data subjects to object to the collection of their data” (see the sketch after this list).
- Another example of a mitigating measure is “public and easily accessible communications” to affected data subjects that “go beyond the information required under Article 13 or 14 GDPR.”
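The Opinion describes an opt-out list only as a concept. Purely as an illustration, the sketch below shows how such a list might gate a data collection pipeline. All names (OPT_OUT_LIST, eligible_for_training, the record fields) are hypothetical; neither the EDPB nor Meta describes an actual implementation.

```python
# Hypothetical sketch of the EDPB's "opt-out list" mitigating measure:
# filter candidate training records against a controller-managed list
# of data subjects who have objected. Illustrative only.

OPT_OUT_LIST = {"user_123"}  # IDs of data subjects who have objected

def eligible_for_training(records, opt_out_list):
    """Yield only records whose authors have not opted out."""
    for record in records:
        if record["author_id"] not in opt_out_list:
            yield record

public_posts = [
    {"author_id": "user_123", "text": "A public post from 2006"},
    {"author_id": "user_789", "text": "A recent public comment"},
]

training_data = list(eligible_for_training(public_posts, OPT_OUT_LIST))
print(training_data)  # only user_789's post survives the filter
```

Filtering of this kind only works before the data is collected or used for training; as noyb’s objections below suggest, honouring an objection after personal data has been ingested into a model is far harder.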
Noyb’s objections
Noyb raises the following 11 legal objections to Meta’s plans:
- Users would not reasonably expect their social media posts, which could be over 20 years old, to be used for AI training. In the 2023 Bundeskartellamt case, the CJEU made a similar point about using data for targeted advertising.
- Meta’s “mere commercial interests” cannot justify such large-scale processing of personal data. Noyb draws on the 2014 Google Spain case, in which the CJEU held that Google’s legitimate interest in crawling public websites could be outweighed by the rights of individuals requesting the de-listing of personal data from search results.
- Meta will be largely unable to comply with data subject rights requests once personal data has been “ingested” into its AI model, particularly once other controllers have used Meta’s open-source AI model post-training.
- Meta is only allowing users to exercise their “right to object” before the processing occurs, whereas noyb argues that this right should also be available afterwards. An allegedly limited version of a basic right, noyb says, should not count as a mitigating measure.
- To the extent that Meta can comply with GDPR rights requests, it intends to apply them only to data directly linked to a user’s account. According to noyb, this means a user’s objection might not cover, for example, a group photo that includes them but is hosted on another user’s account. Meta also appears not to have considered the rights of non-users whose personal data it controls.
- In other contexts, Meta has previously stated that it cannot distinguish between types of users due to the nature of its social network, where many data points are “shared”. Noyb alleges that this means Meta will be unable to distinguish between users who have opted out and those who have not.
- Noyb alleges that Meta’s AI training process will inevitably involve special category data, for which the company will require a separate condition under Article 9 GDPR. Noyb suggests that the only appropriate Article 9 condition is “explicit consent”.
- Meta has not published a legitimate interests assessment, which noyb suggests it is obliged to do.
- Noyb argues that “public” Facebook and Instagram posts are not truly public, given Meta’s anti-scraping measures and the very small audience that some posts actually reach.
- Meta’s processing would violate the principles of “fairness” (given users’ reasonable expectations), “purpose limitation” (since a “general purpose” AI system has no specific, defined purpose), and “data minimisation” (given the scale of the processing).
- Combining data from Facebook and Instagram without consent would allegedly violate the Digital Markets Act (DMA).
Noyb has asked Meta to provide evidence of its compliance, or to stop its AI training, by May 21, 2025.
AI uncertainty
Most of the world’s major generative AI companies have trained their models on the basis of “legitimate interests”. To train an LLM like Meta’s Llama or OpenAI’s GPT, obtaining valid consent from a sufficiently large cohort of people would be a challenge.
But smaller organisations can develop and deploy AI in a manner that is more clearly consistent with people’s rights.
Keeping up to date with regulatory guidance on AI, such as the AI guidance from the UK Information Commissioner’s Office (ICO), is a good first step.